WELFake: Word Embedding Over Linguistic Features for Fake News Detection
Abstract
Social media is a popular medium for the dissemination of real-time news all over the world. Easy and quick information proliferation is one of the reasons for its popularity. An extensive number of users with different age groups, genders, and societal beliefs are engaged in social media websites. Despite these favorable aspects, a significant disadvantage comes in the form of fake news, as people usually read and share information without caring about its genuineness. Therefore, it is imperative to research methods for the authentication of news. To address this issue, this article proposes a two-phase benchmark model named WELFake, based on word embedding (WE) over linguistic features, for fake news detection using machine learning classification. The first phase preprocesses the data set and validates the veracity of news content by using linguistic features. The second phase merges the linguistic feature sets with WE and applies voting classification. To validate its approach, this article also carefully designs a novel WELFake data set with approximately 72,000 articles, which incorporates different data sets to generate an unbiased classification output. Experimental results show that the WELFake model categorizes news as real or fake with 96.73% accuracy, which improves the overall accuracy by 1.31% compared to bidirectional encoder representations from transformer (BERT) and by 4.25% compared to convolutional neural network (CNN) models. Our frequency-based model, which focuses on analyzing writing patterns, outperforms predictive-based related works implemented using the Word2vec WE method by up to 1.73%.
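The second phase described in the abstract (merging linguistic features with a frequency-based word embedding, then applying voting classification) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the two toy linguistic features, the raw term-count vectorizer, the hard (majority) voting rule, and the function names `linguistic_features`, `count_vector`, `merge_features`, and `majority_vote`.

```python
from collections import Counter


def linguistic_features(text):
    """Two toy writing-pattern features (assumed): word count and mean word length."""
    words = text.split()
    n = len(words)
    mean_len = sum(len(w) for w in words) / n if n else 0.0
    return [float(n), mean_len]


def count_vector(text, vocabulary):
    """Frequency-based embedding (assumed form): raw term counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]


def merge_features(text, vocabulary):
    """Concatenate the linguistic features with the term-count vector."""
    return linguistic_features(text) + count_vector(text, vocabulary)


def majority_vote(predictions):
    """Hard voting (assumed rule): return the label predicted by most classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical usage: one headline, a two-word vocabulary, three classifier outputs.
vocab = ["shocking", "report"]
features = merge_features("Shocking report shocking", vocab)
label = majority_vote(["fake", "real", "fake"])
```

In this sketch each classifier would be trained on the merged feature vector, and an odd number of voters avoids ties in a two-class setting. Whether the article uses hard or soft voting is not stated in the abstract.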
Similar resources
Exploiting Tri-Relationship for Fake News Detection
Social media for news consumption is becoming popular nowadays. The low cost, easy access and rapid information dissemination of social media bring benefits for people to seek out news in a timely manner. However, it also causes the wide spread of fake news, i.e., low-quality news pieces that are intentionally fabricated. The fake news brings about several negative effects on individual consumers, news ecos...
Automatic Detection of Fake News
The proliferation of misleading information in everyday access media outlets such as social media feeds, news blogs, and online newspapers has made it challenging to identify trustworthy news sources, thus increasing the need for computational tools able to provide insights into the reliability of online content. In this paper, we focus on the automatic identification of fake content in online...
Stance Detection for Fake News Identification
The latest election cycle generated sobering examples of the threat that fake news poses to democracy. Primarily disseminated by hyper-partisan media outlets, fake news proved capable of becoming viral sensations that can dominate social media and influence elections. To address this problem, we begin with stance detection, which is a first step towards identifying fake news. The goal of this p...
Are Word Embedding-based Features Useful for Sarcasm Detection?
This paper makes a simple increment to the state of the art in sarcasm detection research. Existing approaches are unable to capture subtle forms of context incongruity, which lies at the heart of sarcasm. We explore if prior work can be enhanced using semantic similarity/discordance between word embeddings. We augment word embedding-based features to four feature sets reported in the past. We also e...
ECNU: Multi-level Sentiment Analysis on Twitter Using Traditional Linguistic Features and Word Embedding Features
This paper reports our submission to task 10 (Sentiment Analysis on Tweet, SAT) (Rosenthal et al., 2015) in SemEval 2015, which contains five subtasks, i.e., contextual polarity disambiguation (subtask A: expression-level), message polarity classification (subtask B: message-level), topic-based message polarity classification and detecting trends towards a topic (subtask C and D: topic-level), ...
Journal
Journal title: IEEE Transactions on Computational Social Systems
Year: 2021
ISSN: 2373-7476, 2329-924X
DOI: https://doi.org/10.1109/tcss.2021.3068519